Correlation Analysis of Spatial Time Series Datasets: A Filter-and-Refine Approach
Abstract
A spatial time series dataset is a collection of time series, each referencing a location in a common spatial framework. Correlation analysis is often used to identify pairs of potentially interacting elements from the cross product of two spatial time series datasets. However, the computational cost of correlation analysis is very high when the dimension of the time series and the number of locations in the spatial frameworks are large. The key contribution of this paper is the use of spatial autocorrelation among spatially neighboring time series to reduce this cost. A filter-and-refine algorithm based on coning, i.e., grouping of locations, is proposed to reduce the cost of correlation analysis over a pair of spatial time series datasets. Cone-level correlation computation can be used to eliminate (filter out) a large number of element pairs whose correlation is clearly below (or above) a given threshold. Element-pair correlations then need to be computed only for the remaining pairs. Using experimental studies with Earth science datasets, we show that the filter-and-refine approach can save a large fraction of the computational cost, particularly when the minimal correlation threshold is high.
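The filter-and-refine idea can be illustrated with a minimal Python sketch. The abstract does not give the construction details, so everything below is an assumption made for illustration: each series is centered and scaled to unit length (so correlation equals the cosine of the angle between two vectors), a cone is summarized by a center direction and an angular radius, and the cone-level bounds come from the triangle inequality on angles. The function names (`normalize`, `make_cone`, `cone_bounds`, `filter_and_refine`) and the precomputed location groups are hypothetical, not taken from the paper.

```python
import numpy as np

def normalize(ts):
    """Center each row and scale it to unit length, so that the
    correlation of two rows equals their dot product."""
    ts = np.asarray(ts, dtype=float)
    centered = ts - ts.mean(axis=1, keepdims=True)
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)

def make_cone(unit_rows):
    """Summarize a group of unit vectors as (center direction, angular radius)."""
    center = unit_rows.mean(axis=0)
    norm = np.linalg.norm(center)
    # Fall back to the first member if the mean direction is degenerate.
    center = center / norm if norm > 0 else unit_rows[0]
    angles = np.arccos(np.clip(unit_rows @ center, -1.0, 1.0))
    return center, angles.max() + 1e-7  # small slack for rounding error

def cone_bounds(cone_a, cone_b):
    """Lower and upper bounds on the correlation of any cross-cone pair
    (assumed bound: triangle inequality on angles)."""
    (ca, ra), (cb, rb) = cone_a, cone_b
    gap = np.arccos(np.clip(ca @ cb, -1.0, 1.0))
    min_angle = max(0.0, gap - ra - rb)
    max_angle = min(np.pi, gap + ra + rb)
    return np.cos(max_angle), np.cos(min_angle)

def filter_and_refine(a, b, groups_a, groups_b, threshold):
    """Return all (i, j) with corr(a[i], b[j]) >= threshold.
    groups_a / groups_b are lists of row-index lists (assumed to be
    spatial neighborhoods chosen beforehand)."""
    ua, ub = normalize(a), normalize(b)
    cones_b = [make_cone(ub[g]) for g in groups_b]
    result = []
    for ga in groups_a:
        cone_a = make_cone(ua[ga])
        for gb, cone_b in zip(groups_b, cones_b):
            lower, upper = cone_bounds(cone_a, cone_b)
            if upper < threshold:
                continue  # filter: no pair in this cone pair can qualify
            if lower >= threshold:
                # filter: every pair qualifies, no per-pair work needed
                result.extend((i, j) for i in ga for j in gb)
                continue
            # refine: compute element-pair correlations for this cone pair
            corr = ua[ga] @ ub[gb].T
            for x, i in enumerate(ga):
                for y, j in enumerate(gb):
                    if corr[x, y] >= threshold:
                        result.append((i, j))
    return sorted(result)
```

In this sketch the savings come from the two `continue` branches: the tighter the cones (that is, the stronger the spatial autocorrelation within each group), the more cone pairs are decided without touching individual element pairs.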
Related resources
Spectral-spatial classification of hyperspectral images by combining hierarchical and marker-based Minimum Spanning Forest algorithms
Many studies have demonstrated that spatial information can play an important role in the classification of hyperspectral imagery. This study proposes a modified spectral–spatial classification approach for improving the spectral–spatial classification of hyperspectral images. In the proposed method, ten spatial/texture features, using mean, standard deviation, contrast, homogeneity, corr...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, advances in information gathering technologies such as GPS and GSM networks have led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large datasets has always been one of the most important challenges in different sciences,...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputation of missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Much research has been done on the use of diffe...
Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets
Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...
Application of multivariate techniques in-line with spatial regionalization of AOD over Iran
Introduction: Models, satellites and terrestrial datasets have been used to detect and characterize aerosols. Nonetheless, microscale classification using remote sensing parameters is considered a deficiency. Thus, regionalization and modeling of aerosol without regard to political boundaries or a specific s...